SRI's Submissions to Chinese-English PatentMT NTCIR10 Evaluation

Authors

Bing Zhao

Jing Zheng

Wen Wang

Nicolas Scheffer

Abstract

The SRI team took part in the Chinese-English subtask of the patent machine translation evaluation and submitted translation results that combine the outputs of two types of grammars supported in SRInterp [13], with two different word segmentations. We investigated the effect of adding sparse features, together with several optimization strategies. Also, for the PatentMT domain, we carried out preliminary experiments on adapting language models. Our results showed improvements from these approaches.

Similar resources

SRI Submissions to Chinese-English PatentMT NTCIR10

The SRI team joined the subtask of Chinese-English Patent machine translation evaluation, and submitted the translation results using a combined output from two types of grammars supported in SRInterp, with two different word segmentations. We investigated the effect of adding sparse features, together with several optimization strategies. Also, for the PatentMT domain, we carried out pr...

The HDU Discriminative SMT System for Constrained Data PatentMT at NTCIR10

We describe the statistical machine translation (SMT) systems developed at Heidelberg University for the Chinese-to-English and Japanese-to-English PatentMT subtasks at the NTCIR10 workshop. The core system used in both subtasks is a combination of hierarchical phrase-based translation and discriminative training using either large feature sets and ℓ1/ℓ2 regularization (for Japanese-to-English) ...

NTT-NII Statistical Machine Translation for NTCIR-10 PatentMT

This paper describes details of the NTT-NII system in the NTCIR-10 PatentMT task. The system extends the NTT-UT system in NTCIR-9 with a new English dependency parser (for the EJ task), a syntactic rule-based pre-ordering (for the JE task), and a syntax-based post-ordering (for the JE task). Our system ranked 1st in the EJ subtask in both automatic and subjective evaluation, and was the best SMT system in JE s...

Using Parallel Corpora to Automatically Generate Training Data for Chinese Segmenters in NTCIR PatentMT Tasks

Chinese texts do not contain spaces as word separators, unlike English and many alphabetic languages. To use Moses to train translation models, we must segment Chinese texts into sequences of Chinese words. More and more software tools for Chinese segmentation have become available on the Internet in recent years. However, some of these tools were trained with general texts, so might not handle domain...

The TRGTK's System Description of the PatentMT Task at the NTCIR-10 Workshop

This paper introduces the TRGTK's system for Patent Machine Translation at the NTCIR-10 Workshop. In this year's program, we participate in three subtasks: Chinese-English, English-Japanese and Japanese-English. We submit required system results for Intrinsic Evaluation (IE), Patent Examination Evaluation (PEE), Chronological Evaluation (ChE), and Multilingual Evaluation (ME). Different from last y...

Publication date: 2013
